ABSTRACT



The rationale and description of tests of teaching 
power by which teachers have an equal chance to show their relative 
ability to effect pupils’ achievement in reading skills are 
discussed. Illustrations of these performance tests and a means for 
administering them, "teaching fairs," are also presented. Data in
support of teaching performance tests in reading are reported along
with information about teachers' resistance to such tests. Suggestions
are made for further work with tests of teaching power. (Author/AG) 
PERFORMANCE TESTS: ASSESSING TEACHERS OF READING




John D. McNeil 

University of California, Los Angeles* 

INTRODUCTION 

We believe that the teacher makes a difference in the child's 
progress in learning to read. Not all teachers are equally able to 
help pupils both achieve mastery of critical skills and develop 
positive attitudes toward reading. Unfortunately, administrators
who are charged with responsibility for selecting and improving
teachers seldom have had access to information that enables one
to recognize those teachers who are indeed superior in teaching.

Courses taken, grades, age, experience, personality as judged through
interview, observation, and supervisor's ratings: these and similar
kinds of information have been found inadequate for making judgments
about which of several teachers is the most competent (Morsh & Wilder,
1953). Admittedly, pupil growth is the ultimate criterion for assessing
teacher effectiveness. However, it is unsound to rank teachers on this
criterion when they have not been confronted with a comparable set of
teaching conditions, including factors such as common instructional
tasks, teachable children, and time allowed for teaching. The problem,
therefore, is to design tests of teaching power by which teachers have
an equal chance to show their relative ability.



*Special recognition is due V. Downs and the U.C.L.A. Coordinators
of Student Teaching (elementary) who participated in the
design and conduct of "Teaching Fairs," an original
contribution to performance testing.

GENERAL DESCRIPTION OF PERFORMANCE TESTING

Performance tests are one answer to the problem of identifying
the effective instructor. Typically, performance testing means giving
a number of teachers identical instructional tasks (objectives) and a
sample of a measure to be administered to pupils after the teaching
has occurred. The instructional tactics are left to the teacher.
Frequently, the objective is novel to both teacher and pupils, thereby
eliminating major "contamination" from previous exposure to the subject
matter and aiding in the problem of experimental control. The teachers
are allowed a specific period of time for planning the lesson(s) and
for the teaching. Groups of learners are assigned to the teachers as
pupils. These learners are drawn from a common population and are
randomly assigned to a group for instruction. Following the
instructional period, all pupils are assembled to complete a test
which measures pupil attainment of the instructional objectives. The
mean of the test scores earned by pupils taught by a given teacher
indicates that teacher's standing in ability to teach the predetermined
skill or concept.

EXAMPLE OF PERFORMANCE TEST

(Task one: A test to provide evidence of a teacher's ability
to help pupils break a code. Although the "letters" in this teaching
exercise are artificial, the task is not altogether unlike that of
recognizing long and short vowel sounds in printed words that follow
consonant-vowel-consonant and consonant-vowel patterns. No assumption
is made, however, that children who master the contrived task will be
able to perform with conventional letters.) Teaching time: 15 minutes.

Objective: Given a list of written words in code, the pupil will be
able to circle those which contain a short bling sound produced by
a GBG (glonk-bling-glonk) pattern. In this exercise, a short bling
sound occurs only when the bling is both immediately preceded and
immediately followed by a glonk.
The following are symbols standing for blings: [symbols not legible in this copy]
The following are symbols standing for glonks: [symbols not legible in this copy]
All words are made up of glonks and blings. Sample
test item: Circle the words which have a short bling sound.
[sample row of coded words not legible in this copy]

Post Test: (To be given to pupils. Not available to teachers.)

The tester will read the directions aloud.

Directions: Circle each word below which has a short bling sound.



[Items 1 through 15: rows of coded words printed in the artificial symbols, not legible in this copy]



Circle the answer that tells how you feel about the questions:

A. Do you want more lessons from the teacher who taught you the code? Yes No

B. Do you want more lessons like this code lesson? Yes No
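The decoding rule in the objective of Task one can be written out as a short check. This is only an illustrative sketch: because the artificial symbols cannot be reproduced here, the letters "G" (glonk) and "B" (bling) are stand-ins, the function name is hypothetical, and the sketch assumes that every bling flanked by two glonks is short (the text states only that a short bling occurs "only when" it is so flanked).

```python
def has_short_bling(word: str) -> bool:
    """Return True if some bling ('B') has a glonk ('G') on both sides.

    'G' and 'B' are stand-ins for the artificial symbols (an assumption).
    """
    return any(
        word[i] == "B" and word[i - 1] == "G" and word[i + 1] == "G"
        for i in range(1, len(word) - 1)
    )


# Hypothetical coded words, not items from the actual test.
for word in ["GBG", "GB", "BGB", "GBBG", "GGBGB"]:
    print(word, has_short_bling(word))
```

A pupil who has mastered the rule would circle only the words for which the check is true; a bling at the start or end of a word, or next to another bling, does not qualify.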
In addition to the test presented above, three other teaching
tests have been used with more than 200 teachers in training. These
tests require the teacher to teach tasks analogous to:

(a) recognizing new words and then selecting the one word from
among several that will best complete a sentence composed of
these words, a task demanding both skill in word recognition
and ability to apply structure of language in completing
sentences. (Test 2)

(b) indicating the sound value of a given letter when there is a
single letter c with two sound values, /s/ and /k/ (c as in cent;
c as in cat). (Test 3)

(c) determining pronunciation of initial vowels in words by using
the "silent e rule." (Test 4)

Each of the tests was developed in accordance with the following 
guidelines: 

1. The objective or task should be analogous to an important
skill in word recognition (validity).

2. The task should require the learner to apply his learning
to fresh instances (no teaching to the test).

3. Evidence should be collected indicating the child's
attitude or predisposition toward both teacher and task.

4. The task should be complex enough to allow teachers to make
decisions regarding such matters as reinforcement, pacing,
relevant and irrelevant practice, identification of prerequisite
skills, and sequencing.

5. The task should be one that pupils with competent help can
master within the time allowed, yet must be difficult enough so
that it will discriminate among teachers.

TEST TRYOUTS 

Three populations of teachers have served as subjects: those
enrolled in a methods course in reading, those completing an initial
assignment as student teachers, and those finishing a second teaching
assignment. These teachers were grouped, approximately 20 to a group,
and directed to a school for participation in a "Teaching Fair." A
Teaching Fair is akin to traditional fairs where skilled persons enter
competitive contests, publicly displaying their expertise. The fairs
took place in schools ranging from inner city schools where pupil
performance on standardized tests was among the nation's lowest to wealthy
suburban schools where reading achievement scores ranked in the top
ten percent on national norms. Schools, pupils, and task (in its
analogous form) were unfamiliar to the teacher. About one hundred twenty
children at each site from second, third and fourth grades served as the
learners. Children identified as having exceptional intelligence,
emotional behavior, or language backgrounds were not included in the
population taught. All teachers taught at the same time, usually
in a common location such as a lunch area. Each
teacher taught first a group of three children (randomly drawn from
the pupil population) and then, after a fifteen-minute recess, taught a
second group of three. Mass testing of pupils followed immediately
after the lesson and was conducted by independent auditors. Pupil
responses were corrected and the teacher's total score was compiled
for both correct items and positive responses toward teacher and
task. Teachers were then ranked within their groups. The results
were available for those hiring teachers in the case of student
teachers seeking employment and for grading purposes in the methods
course.
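The mechanics of a Teaching Fair, random drawing of pupil groups followed by ranking teachers on pupil results, can be sketched in a few lines. This is a minimal illustration and not the procedure actually used: the function names and the seed are hypothetical, the sketch assumes that no child is taught twice (consistent with about 20 teachers, two groups of three each, and about 120 children, but not stated in the text), and it assumes each pupil's score is a single number combining correct items and positive attitude responses.

```python
import random
from statistics import mean


def draw_groups(pupils, teachers, group_size=3, sessions=2, seed=None):
    """Randomly draw one pupil group per teacher per session, reusing no pupil."""
    needed = len(teachers) * group_size * sessions
    if needed > len(pupils):
        raise ValueError("not enough pupils for every teacher and session")
    rng = random.Random(seed)
    pool = rng.sample(pupils, needed)  # random draw without replacement
    groups = {}
    for session in range(sessions):
        for teacher in teachers:
            groups[(teacher, session)] = [pool.pop() for _ in range(group_size)]
    return groups


def rank_teachers(pupil_scores):
    """Order teachers by the mean score of the pupils they taught, highest first."""
    means = {teacher: mean(scores) for teacher, scores in pupil_scores.items()}
    return sorted(means, key=means.get, reverse=True)


# Roughly the scale of one fair: 20 teachers and 120 children.
teachers = [f"T{i:02d}" for i in range(20)]
pupils = [f"P{i:03d}" for i in range(120)]
groups = draw_groups(pupils, teachers, seed=1)

# Hypothetical scores for two teachers (six pupils each), not data from the study.
example_scores = {"T00": [12, 14, 9, 15, 11, 13], "T01": [8, 10, 7, 9, 12, 6]}
print(rank_teachers(example_scores))
```

Drawing all groups from one shuffled pool is what gives each teacher the same chance of receiving easy or difficult pupils, which is the condition the ranking depends on.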

WHAT HAVE WE LEARNED? 

GENERALIZATIONS 

A caveat is in order. The particular tests described in this
paper do not represent all alternatives possible under the rubric
performance testing. Illustrations of other formats can be seen in
the works of Justiz (1969), Popham (1971), and Taneman (1970). Test
developers have many other options such as increasing the number of
days required for teaching and providing, in addition to objectives
and sample test items, instructional resources and distractors. With
respect to the present tests, there are data indicating test validity,
reliability and practicality.

1. Validity. The tests are drawn from reading skills generally
recognized as important in learning to read. (See consensus of
reading skills as determined by Otto and Peterson, 1969.) It
is true, however, that the tasks are analogous to the reading
skills and not identical to them. It is assumed that the teacher
who can succeed in communicating the key to breaking the artificial
code can also communicate the key for breaking the conventional
code.

2. Reliability. First session scores of fifteen teachers on
Test 1 above were compared with these teachers' scores from a
second session. The correlation between the scores was .521,
significant at the .05 level. Also, correlations between teachers'
performances on the different tests given ten weeks apart with
different kinds of pupils were positive. For instance, thirty
teachers took Test 3 at the end of their methods course and
completed Test 4 ten weeks later after a student teaching
assignment. Their scores showed a Pearsonian r of .388 (p < .05). As
indicated in Table 1 below, one could have made a probable
prediction about the likelihood of high achieving and low
achieving teachers (top 25 percent and bottom 25 percent) making
a similar showing on a second test weeks later. The chances of
these teachers maintaining their level were greater than three to
one.



Table 1

Chi Squares for High and Low Teacher
Performance on Tests of Ability to
Teach Two Different Tasks of Reading

                        TASK 4
                     High    Low
TASK 3   13 high       9       4
         13 low        4       9
         X² = 4.5[?] (p < .05)

                        TASK 4
                     High    Low
TASK 2   16 high      12       4
         16 low        6      10
         X² = 4.40 (p < .05)

3. Utility. There are three ways in which utility has been shown.
First, employers have stated that in making a decision about which
of several teachers to hire, the information about a teacher's
performance on the test relative to his peers is of value along
with other kinds of information. Second, some teachers who did
not obtain satisfactory results the first time they took such a
test have been able to study the demands of these tests and to
analyze their own practice with respect to these demands, thereby
improving in their ability to perform. Third, the tests have been
used as a research tool. Taped records of the teaching carried out
by high and low scoring teachers have been analyzed and promising
instructional procedures have been identified. These procedures
are now being systematically manipulated to verify their importance.
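The significance claims attached to the two correlations reported under Reliability can be checked with the usual t test for a Pearson r. The sketch below is a reader's check under stated assumptions, not a reconstruction of the original analysis: it assumes the sample sizes are the fifteen and thirty teachers mentioned in the text, that the test was two-tailed, and it takes the critical values from a standard t table.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for testing that a Pearson correlation is zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Two-tailed .05 critical values of t from a standard table, keyed by df.
# The choice of a two-tailed test is an assumption; the text does not say.
T_CRITICAL_05 = {13: 2.160, 28: 2.048}

reported = [
    ("Test 1, first vs. second session", 0.521, 15),
    ("Test 3 vs. Test 4, ten weeks apart", 0.388, 30),
]

for label, r, n in reported:
    df = n - 2
    t = t_from_r(r, n)
    significant = t > T_CRITICAL_05[df]
    print(f"{label}: r = {r}, t({df}) = {t:.2f}, significant at .05: {significant}")
```

Under these assumptions both correlations clear the .05 threshold, though only narrowly, which fits the modest size of the coefficients.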

OPPOSITION TO USE 

Any information which might be used to assess a teacher and for
making decisions about his employability is likely to generate anxiety.
Consequent resistance to the performance test as an information
gathering scheme seems to take these forms:

1. A few teachers reflect their "equalitarian bag" and minimize
aptitude. They play down the fact that not all teachers are
equally competent to serve children after a fixed period of
training. There are those who state that it is the business of
the training program to ensure that all teachers succeed.

2. There are teachers who want to be judged solely on subjective
criteria or on the basis of their efforts, not results produced.
They feel more confident in competing with their peers on the
basis of personality and hard work. Those who have had a history
of success in winning friends and influencing others probably
believe they have a better chance of competing for a job on the
basis of the general impression they make on supervisors,
principals, and interviewers than when forced to compete on the
basis of their ability to effect desired changes in learners.
Other teachers feel that because they worked hard, even though
they accomplished little with children, they are good teachers.

3. Some teachers have claimed that they did not receive equal
opportunity to succeed on the test. When it can be verified
that indeed their pupils or situations were not representative,
these teachers have been given another chance on a different test.
Usually teachers begin to question their own performance rather
than to blame the pupils for the failure when it is shown that
other teachers get successful results with the same group of
pupils on a related task under similar conditions.

NEXT STEPS

Further test development is needed. Variations in test construction
should be created and tried out. Also, research should be undertaken to
find out how generalizable the results of performance tests are; e.g.,
What is the relation of short fifteen-minute performance scores to
semester goals? What is the relation of a teacher's success across
different classes of reading skills, in addition to the relation of his
success on reading tasks within a single class of skills? Then too, the
attitude and role of community groups, teachers' organizations, and
personnel commissions with respect to performance tests deserve study.
One likely pressure in favor of performance testing is the recent
Supreme Court decision barring discriminatory job testing. This action
should result in school employers demanding tests that will provide
information predictive of or correlated with important elements in the
teaching of reading, the job for which the candidate is being evaluated.